AITopics | model misspecification

Iterative Missing Data Imputation with Model Form Adaptation and Non-Missing Feature Supervision

Neural Information Processing Systems | Jun-15-2026, 12:07:24 GMT

Iterative imputation is a prevalent method for missing data imputation, in which each feature is imputed in turn by treating it as a target variable estimated from all other features. However, iterative imputation suffers from two principal limitations: (1) it imposes a single parametric model form on all features, neglecting the possibility that the optimal model varies among features, which risks model misspecification; and (2) it assumes every feature contains missing values, overlooking the potential presence of non-missing features, termed oracle features, which are informative for imputation. To address these limitations, we propose kernel point imputation (KPI), a bi-level optimization framework for iterative missing data imputation. At the inner level, KPI adaptively learns the optimal model form for each feature within a reproducing kernel Hilbert space, addressing limitation (1). At the outer level, KPI utilizes oracle features as supervisory signals to iteratively refine the imputations, addressing limitation (2). Experiments demonstrate that KPI outperforms competitive imputation methods. Code is available at https://github.com/FMLYD/kpi.git.
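
The baseline procedure described in the abstract above can be sketched as follows. This is a minimal illustration of generic round-robin iterative imputation, not the KPI method and not the code in the linked repository: the function name `iterative_impute`, the ridge-regularised linear model, and the parameters `n_iters` and `ridge` are all assumptions chosen for brevity. The sketch deliberately uses one linear model form for every feature, which is the first limitation the abstract names.

```python
import numpy as np

def iterative_impute(X, n_iters=10, ridge=1e-3):
    """Hypothetical helper: round-robin imputation with one linear model per column."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initialise missing entries with the mean of the observed entries per column.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)
    for _ in range(n_iters):
        for j in range(X.shape[1]):
            rows = missing[:, j]
            if not rows.any():
                continue  # fully observed column: nothing to impute
            # Treat column j as the target and all other columns as predictors.
            others = np.delete(filled, j, axis=1)
            A = np.column_stack([np.ones(len(others)), others])  # intercept term
            A_obs, y_obs = A[~rows], filled[~rows, j]
            # Ridge-regularised least squares, fitted where column j is observed.
            gram = A_obs.T @ A_obs + ridge * np.eye(A.shape[1])
            w = np.linalg.solve(gram, A_obs.T @ y_obs)
            # Overwrite only the entries that were originally missing.
            filled[rows, j] = A[rows] @ w
    return filled
```

Fully observed columns are never overwritten here and serve only as predictors; using them as supervisory signals, as the abstract describes for oracle features, would need additional machinery that this sketch does not attempt.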

data quality, imputation, machine learning, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.67)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Robust Sequential Experimental Design for A/B Testing

Wen, Qianglin, Wu, Xiangkun, Shi, Chengchun, Li, Ting, Tang, Niansheng, Zhang, Yingying, Zhu, Hongtu

arXiv.org Machine Learning | May-14-2026

Experimental design has emerged as a powerful approach for improving the sample efficiency of A/B testing, yet existing designs rely critically on correctly specified models. We study robust sequential experimental design under model misspecification and develop a unified framework that covers both contextual bandit and dynamic settings. Theoretically, we prove that our design bounds the worst-case mean squared error of the estimated treatment effect. Empirically, we demonstrate the effectiveness of the proposed approach using synthetic and real-world datasets from a leading technology company.

artificial intelligence, machine learning, robust sequential experimental design, (16 more...)

arXiv.org Machine Learning

2605.12899

Country:

Asia > China (0.46)
North America > United States (0.45)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.68)
Transportation (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
(2 more...)

16c5b4102a6b6eb061e502ce6736ad8a-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 07:25:15 GMT

artificial intelligence, machine learning, statistics, (17 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Misspecified Gaussian Process Bandit Optimization

Neural Information Processing Systems | Apr-24-2026, 21:47:33 GMT

We consider the problem of optimizing a black-box function based on noisy bandit feedback. Kernelized bandit algorithms have shown strong empirical and theoretical performance for this problem. However, they rely heavily on the assumption that the model is well-specified and can fail without it. Instead, we introduce a misspecified kernelized bandit setting where the unknown function can be ε-uniformly approximated by a function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS).
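
The uniform-approximation condition in the abstract above can be written out as below. The notation is an assumption made for illustration and may differ from the paper's: f is the unknown function on a domain D, H_k is the RKHS of a kernel k, B is the norm bound, and ε is the approximation level.

```latex
\[
\exists\, \tilde{f} \in \mathcal{H}_k \ \text{with}\ \|\tilde{f}\|_k \le B
\quad \text{such that} \quad
\|f - \tilde{f}\|_\infty
  = \sup_{x \in \mathcal{D}} \bigl| f(x) - \tilde{f}(x) \bigr|
  \le \epsilon .
\]
```

Under this reading, ε = 0 corresponds to the well-specified case in which f itself lies in the RKHS with bounded norm.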

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Online Clustering of Bandits with Misspecified User Models

Neural Information Processing Systems | Apr-24-2026, 16:33:03 GMT

The contextual linear bandit is an important online learning problem where, given arm features, a learning agent selects an arm at each round to maximize the cumulative rewards in the long run. A line of work, called the clustering of bandits (CB), utilizes the collaborative effect over user preferences and has shown significant improvements over classic linear bandit algorithms. However, existing CB algorithms require well-specified linear user models and can fail when this critical assumption does not hold. Whether robust CB algorithms can be designed for more practical scenarios with misspecified user models remains an open problem. In this paper, we are the first to present the important problem of clustering of bandits with misspecified user models (CBMUM), where the expected rewards in user models can be perturbed away from perfect linear models. We devise two robust CB algorithms, RCLUMB and RSCLUMB (representing the learned clustering structure with a dynamic graph and sets, respectively), that can accommodate the inaccurate user preference estimations and erroneous clustering caused by model misspecifications. We prove regret upper bounds of O(ϵT√(md log T) + d√(mT) log T) for our algorithms under milder assumptions than previous CB works (notably, we move past a restrictive technical assumption on the distribution of the arms), which match the lower bound asymptotically in T up to logarithmic factors, and also match the state-of-the-art results in several degenerate cases. The techniques used to bound the regret caused by misclustering users are quite general and may be of independent interest. Experiments on both synthetic and real-world data show that our algorithms outperform previous ones.
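
One way to write down a reward model that is "perturbed away from perfect linear models" is shown below. The notation is an assumption for illustration, not taken from the paper: x_a is the feature vector of arm a, θ_{u_t} is the preference vector of the user served at round t, and Δ_t(a) is the deviation from linearity. Reading the ϵ in the regret bound as a uniform bound on that deviation is likewise an assumption, made only because it is consistent with the abstract.

```latex
\[
\mathbb{E}\bigl[ r_t \mid a_t \bigr]
  = x_{a_t}^{\top} \theta_{u_t} + \Delta_t(a_t),
\qquad
\bigl| \Delta_t(a) \bigr| \le \epsilon
\quad \text{for all arms } a \text{ and rounds } t .
\]
```

Setting Δ_t(a) = 0 everywhere recovers the well-specified linear model assumed by earlier CB algorithms. A deviation of size up to ϵ at every round can accumulate over T rounds, which fits the factor ϵT appearing in the first term of the stated regret bound.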

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Industry: Education (0.54)

Technology: